Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI golden tests for common command usage #537

Merged
merged 4 commits into from
Jul 17, 2019

Conversation

KtorZ
Copy link
Member

@KtorZ KtorZ commented Jul 12, 2019

Issue Number

#357

Overview

  • I have transformed the comments in the CLI module about usage into executable specifications

Comments

The idea is to get some sort of visual golden tests to help us keep track of the CLI looks like and if it's not getting out of hands.

@KtorZ KtorZ requested a review from piotr-iohk July 12, 2019 15:46
@KtorZ KtorZ self-assigned this Jul 12, 2019
@jonathanknowles jonathanknowles requested review from jonathanknowles and removed request for jonathanknowles July 15, 2019 07:37
Copy link
Member

@jonathanknowles jonathanknowles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a big improvement. 👍

Are you planning to make similar tests for exe/wallet/{http-bridge,jormungandr}/Main.hs?

I only noticed one thing: individual command functions still have comments. For example:

-- | cardano-wallet mnemonic generate [--size=INT]
cmdMnemonicGenerate :: Mod CommandFields (IO ())    

Are these necessary, now that we have tests? Perhaps they should also be removed.

@@ -262,8 +262,6 @@ runCli = join . customExecParser preferences

{-------------------------------------------------------------------------------
Commands - 'mnemonic'

cardano-wallet mnemonic generate [--size=INT]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to also remove the comments on functions like the one below:

-- | cardano-wallet mnemonic generate [--size=INT]
cmdMnemonicGenerate :: Mod CommandFields (IO ())         

It seems these are now superfluous, as we have:

["mnemonic", "generate", "--help"] `shouldShowUsage`  
    [ "Usage:  mnemonic generate [--size INT]"
    , "  Generate English BIP-0039 compatible mnemonic words."

@rvl
Copy link
Contributor

rvl commented Jul 15, 2019

I agree with the goal "don't let the CLI get out of control", but not fond of the mechanism used here.

Maintaining the golden strings in this way is going to be a pain.

Golden tests ensure that the help text won't change -- not even a single whitespace character is allowed to change. This is not what we want. We want a qualitative judgement about whether the CLI is sensible (i.e. "not getting out of hands").

Why don't we generate a man page style document from the CLI and upload it to the web site through CI. There are multiple ways of doing this, and the solution doesn't need to be highly sophisticated. Then QA can compare the man page between releases and assess whether it is suitable.

@jonathanknowles
Copy link
Member

@rvl wrote:

Maintaining the golden strings in this way is going to be a pain.

If we think it's a pain to maintain strings this way, then perhaps we can follow the approach used in hspec-golden-aeson, where expected results are stored in version-controlled files?

When running the tests, we compare each expected output with the contents of the appropriate file. If a given output is unexpected, then the test suite overwrites the appropriate file. On test failure, we can check trivially to see whether it's a "real" failure. If not, then we can commit the new versions and carry on.

What do you think?

@KtorZ
Copy link
Member Author

KtorZ commented Jul 15, 2019

@rvl @jonathanknowles

I tend to agree with @rvl on that. This is going to be cumbersome to maintain, and we need to make sure that we keep an eye on it actually. I already imagine people updating things without much considerations (by lack of time or priority ...).

Note that we already have an command-line guide on the wiki:

https://github.com/input-output-hk/cardano-wallet/wiki/Wallet-command-line-interface

It is not automatically generated, but we do maintain it and update it on every release, with QA validation on top. Perhaps this is enough in the end.

@KtorZ
Copy link
Member Author

KtorZ commented Jul 15, 2019

If we think it's a pain to maintain strings this way, then perhaps we can follow the approach used in hspec-golden-aeson, where expected results are stored in version-controlled files?

@jonathanknowles I don't think this solves our issue here, unless we do indeed manually check those files (which don't really happen with the API golden tests..). We do want to keep an eye on the output and make sure the CLI remain in-control. If we tests against auto-generated strings, it's not going to help much as we could simply auto-generate "gibberish" and validate against it.

@piotr-iohk
Copy link
Contributor

piotr-iohk commented Jul 15, 2019

@rvl, @jonathanknowles, @KtorZ

Maintaining the golden strings in this way is going to be a pain.

Isn't it just the matter of adjusting the test whenever you change something? In that case it shouldn't be a lot of additional work, or maybe I miss something?

I generally like the idea. In my mind such tests can give feedback immediately (on every pr or merge to master) in case something is wrong. Off course this would not throw away manual verification of the CLI however such verification is not done every day, (it's done rather opportunistically every once in a while or just before the release). Such tests would give potenital feedback automatically and much more often.

When running the tests, we compare each expected output with the contents of the appropriate file. If a given output is unexpected, then the test suite overwrites the appropriate file.

It sounds a bit unsafe to me as we can get some unwanted content to get merged and then test against it...

@jonathanknowles
Copy link
Member

@KtorZ

(which don't really happen with the API golden tests..)

I was under the impression that the tests fail if the output is different from the expected output? Perhaps I was mistaken? (Runs away to check.)

@piotr-iohk
Copy link
Contributor

@KtorZ

I tend to agree with @rvl on that. This is going to be cumbersome to maintain, and we need to make sure that we keep an eye on it actually. I already imagine people updating things without much considerations (by lack of time or priority ...).

I think that when you have a test that you also need to look at and adjust when making a change it gives you additional opportunity to consider the change actually... ¯_(ツ)_/¯

@KtorZ
Copy link
Member Author

KtorZ commented Jul 15, 2019

I was under the impression that the tests fail if the output is different from the expected output? Perhaps I was mistaken? (Runs away to check.)

They do, and it happens when we change API types. Then usually, golden samples get re-generated and merged. I believe people usually look at the generated golden tests when this happens and give a justification for the change. But, I haven't actually checked them in a while. There's no manual review on the output there which is ... okay for API tests since they really are there for alerting us in case we were to change the representation of an API type in a non-compatible way.

The CLI tests here have a rather different purpose: we want to be sure that our little optparse-applicative pieces, once composed together, look reasonable and / or consistent. In that sense, we are much more interested into the output and the way it looks.

@KtorZ
Copy link
Member Author

KtorZ commented Jul 17, 2019

What do we do about this one?

@rvl @piotr-iohk @jonathan.knowles?

@jonathanknowles
Copy link
Member

jonathanknowles commented Jul 17, 2019

What do we do about this one?

@rvl @piotr-iohk @jonathan.knowles?

While the solution used in this PR has some drawbacks, I still think it's an improvement over the status quo.

With the status quo, we have our specification in comments that are checked by neither the compiler nor our test suite.

With this, at least our specification is checked, even if it's not encoded in the most efficient fashion.

How about we go ahead with this PR, and then in future if we really find that it's cumbersome to update, then perhaps we can look at the problem again?

@KtorZ KtorZ force-pushed the KtorZ/cli-specification-usage branch from 1140e97 to 28dd9c0 Compare July 17, 2019 09:03
@KtorZ KtorZ force-pushed the KtorZ/cli-specification-usage branch from 28dd9c0 to c783a42 Compare July 17, 2019 10:26
Copy link
Contributor

@piotr-iohk piotr-iohk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@KtorZ KtorZ merged commit e752723 into master Jul 17, 2019
@KtorZ KtorZ deleted the KtorZ/cli-specification-usage branch July 17, 2019 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants